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Abstract 

Background:  Cryptic  species  complexes  are  common  among  anophelines.  Previous  phylogenetic  analysis  based 
on  the  complete  mtDNA  COI  gene  sequences  detected  paraphyly  in  the  Neotropical  malaria  vector  Anopheles 
marajoara.  The  "Folmer  region"  detects  a  single  taxon  using  a  3%  divergence  threshold. 

Methods:  To  test  the  paraphyletic  hypothesis  and  examine  the  utility  of  the  Folmer  region,  genealogical  trees 
based  on  a  concatenated  ( white  +  3'  COI  sequences)  dataset  and  pairwise  differentiation  of  COI  fragments  were 
examined.  The  population  structure  and  demographic  history  were  based  on  partial  COI  sequences  for  294 
individuals  from  14  localities  in  Amazonian  Brazil.  109  individuals  from  12  localities  were  sequenced  for  the  nDNA 
white  gene,  and  57  individuals  from  1 1  localities  were  sequenced  for  the  ribosomal  DNA  (rDNA)  internal 
transcribed  spacer  2  (ITS2). 

Results:  Distinct  A.  marajoara  lineages  were  detected  by  combined  genealogical  analysis  and  were  also  supported 
among  COI  haplotypes  using  a  median  joining  network  and  AMOVA,  with  time  since  divergence  during  the 
Pleistocene  (<100,000  ya).  COI  sequences  at  the  3'  end  were  more  variable,  demonstrating  significant  pairwise 
differentiation  (3.82%)  compared  to  the  more  moderate  2.92%  detected  by  the  Folmer  region.  Lineage  1  was 
present  in  all  localities,  whereas  lineage  2  was  restricted  mainly  to  the  west.  Mismatch  distributions  for  both 
lineages  were  bimodal,  likely  due  to  multiple  colonization  events  and  spatial  expansion  (-798  -  81,045  ya).  There 
appears  to  be  gene  flow  within,  not  between  lineages,  and  a  partial  barrier  was  detected  near  Rio  Jari  in  Amapa 
state,  separating  western  and  eastern  populations.  In  contrast,  both  nDNA  data  sets  (white  gene  sequences  with  or 
without  the  retention  of  the  4th  intron,  and  ITS2  sequences  and  length)  detected  a  single  A.  marajoara  lineage. 
Conclusions:  Strong  support  for  combined  data  with  significant  differentiation  detected  in  the  COI  and  absent  in 
the  nDNA  suggest  that  the  divergence  is  recent,  and  detectable  only  by  the  faster  evolving  mtDNA.  A  within 
subgenus  threshold  of  >2%  may  be  more  appropriate  among  sister  taxa  in  cryptic  anopheline  complexes  than  the 
standard  3%.  Differences  in  demographic  history  and  climatic  changes  may  have  contributed  to  mtDNA  lineage 
divergence  in  A.  marajoara. 


Background 

About  1.8  million  species  are  known  on  Earth,  includ¬ 
ing  more  than  1  million  insects,  250,000  higher  plants 
and  69,000  fungi  [1].  The  Amazon  comprises  much  of 
this  biodiversity  and  is  considered  the  largest  gene 
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reserve  in  the  world,  with  an  estimated  14%  of  all 
plant  and  animal  species  within  its  boundaries  [2]. 
Because  speciation  is  not  always  accompanied  by  mor¬ 
phological  change  [3],  the  true  number  of  biological 
species  is  likely  to  be  greater  than  current  estimates 
[4].  Genetic  analysis  plays  an  increasingly  important 
role  in  identifying  changes  in  population  structure 
and  elucidating  taxonomic  status  and  phylogenetic 
relationships. 
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Genetically  divergent  but  morphologically  cryptic  spe¬ 
cies  have  been  described  in  many  aquatic  organisms  [5], 
birds  [6,7]  and  insects  [8,9],  particularly  among  mosqui¬ 
toes  in  the  Dipteran  genus  Anopheles  [10-12].  The  Neo¬ 
tropical  Albitarsis  Complex  contains  at  least  six  species, 
only  some  of  which  are  documented  malaria  vectors: 
Anopheles  marajoara,  a  local  and  regionally  important 
vector  in  lowland  rainforest  [13-15],  Anopheles  jancon- 
nae  (previously  Anopheles  albitarsis  E)  implicated  in 
local  transmission  in  Amazonian  savannah  [16];  and 
Anopheles  deaneorum,  which  appears  to  consist  of  mul¬ 
tiple  species  [17],  and  is  a  potential  vector,  as  demon¬ 
strated  by  comparative  susceptibility  laboratory  studies 
[18,19], 

Various  methods  have  been  used  to  investigate  species 
delimitation  and  identifications  in  the  Albitarsis  Com¬ 
plex  [11,20,21],  culminating,  most  recently,  in  the  recog¬ 
nition  of  six  species  [22],  but  see  Bourke  et  al  [17]. 
Genealogical  analyses  of  complete  mtDNA  COI  (Cyto¬ 
chrome  oxidase  I)  sequences  found  that  A.  marajoara  is 
paraphyletic,  and  may  consist  of  at  least  two  phyloge¬ 
netic  species  [10]  or  lineages  [3,23],  one  of  which  may 
be  A.  janconnae. 

The  mitochondrial  genome  has  been  used  extensively 
in  studies  of  molecular  evolution  [24]  and  the  COI  gene 
has  resolved  evolutionary  relationships  among  closely 
related  species  for  a  wide  range  of  taxa  [25,26],  including 
insects  [27,28]  and  cryptic  species  complexes  [29,30].  The 
Folmer  region,  648-bp  at  the  5’  end  of  the  COI  mito¬ 
chondrial  gene,  has  emerged  as  the  standard  barcode 
region  [31,32].  Interspecific  divergence  within  insects 
almost  always  exceeds  3%  and  this  value  has  been  used 
as  a  speciation  threshold  [33].  The  true  test  of  DNA  bar¬ 
code  precision  would  include  comparisons  with  sister 
species  [34].  The  utility  of  DNA  barcoding  among  insects 
is  still  being  debated,  because  of  success  in  revealing 
cryptic  species  [31,35,36]  on  one  hand,  and  the  inability 
to  reliably  detect  species  boundaries  on  the  other  [36-38]. 
An  estimated  20%  failure  rate  has  been  noted  at  the  spe¬ 
cies  level  due  to  non-monophyly  [39],  which  increases 
among  insects  due  to  overlapping  ranges  of  intra-  and 
interspecific  sequence  divergences  [40].  Together  these 
findings  suggest  that  the  threshold  level  could  be  set 
lower  than  3%  to  minimize  false  negatives  [36]. 

Accurate  morphological  identification  of  the  adult 
females  of  species  within  the  Albitarsis  Complex  is  vir¬ 
tually  impossible,  and  the  ITS2  has  become  a  recognized 
molecular  tool  for  identification  [21,41,42].  ITS2  varia¬ 
tion  is  low  within  a  species  due  to  homogenization  and 
fixation  while  the  overall  fragment  length  is  generally 
variable  between  species  [20,43-45].  As  such  it  can 
usually  resolve  phylogenetic  relationships  at  different 
taxonomic  levels,  and  detect  recently  diverged  taxa  such 
as  sibling  species  of  mosquitoes  [46]. 


Members  of  a  species  are  rarely  distributed  homoge¬ 
neously  in  space,  and  population  subdivision  can  occur 
in  response  to  geographical  boundaries,  social  behaviour 
and  genetic  variation  [47].  Patterns  in  biological 
sequence  data  that  arise  from  ancestry  can  be  useful  in 
determining  the  structure  and  boundaries  of  a  given 
species  [48].  The  objective  of  this  study  was  to  test  the 
hypothesis  of  paraphyly  [10]  using  a  combined  data  set 
( white  gene  +  COI)  and  to  evaluate  the  population 
structure  of  A.  marajoara  to  address  the  following  ques¬ 
tions:  1)  is  the  proposed  paraphyletic  status  supported; 
2)  what  is  the  level  of  genetic  differentiation  between 
populations;  3)  can  lineages  of  A.  marajoara  be  distin¬ 
guished  by  the  Barcode  of  Life  (BOLD)  3%  species 
threshold;  and  4)  can  genetic  differentiation  be 
explained  by  demographic  phenomena,  geographic 
boundaries  and  (or)  natural  selection. 

Methods 

Mosquito  collection 

Adult  female  mosquitoes  from  seven  localities  spanning 
roughly  890  kilometers  of  a  transect  along  the  Amazon 
River  were  collected  using  Shannon  traps  adjacent  to 
breeding  habitats  between  Amazonas  state  near  Urucara 
along  the  Amazon  river  and  the  tributary  of  Rio  Paru 
entering  the  Amazon  river.  They  were  also  collected 
from  Itaituba,  south  of  the  Amazon  River,  Para  State,  in 
2005  (collection  protocol  approved  by  the  New  York 
State  Department  of  Health  IRB  and  Brazilian  Instituto 
Evandro  Chagas,  Belem,  Para  state  Ethical  Committee). 
Mosquitoes  were  identified  morphologically  using  the 
key  of  Deane  et  al.  [49]  as  A.  albitarsis  s.l.  Previously 
extracted  specimens  collected  between  1995  and  2001, 
using  a  human  landing  catch  protocol  approved  by  the 
University  of  Vermont  and  the  Brazilian  Instituto  Evan¬ 
dro  Chagas,  Belem,  Para  state  Ethical  Committee,  from 
seven  localities  in  northeastern  Para  and  Amapa  states 
of  Brazil  and  from  Itaituba,  south  of  the  Amazon  River 
in  Para  State,  were  also  included  [50].  Genomic  DNA 
was  extracted  using  the  Puregene  DNA  isolation  kit 
(Gentra  Systems)  and  maintained  at  Griffin  Laboratory 
at  -80°C.  Ten  to  27  mosquitoes  per  site  were  selected 
for  DNA  amplification  and  sequencing  of  the  COI 
(Table  1).  A  subset  of  four  to  15  samples  for  the  white 
gene  and  four  to  seven  samples  for  ITS2  were  amplified 
and  sequenced  (Table  1). 

Amplification  and  sequencing 

A  1200-bp  fragment  of  the  COI  gene  was  amplified  using 
the  forward  primer  UEA3  and  the  reverse  primer  UEA10 
[28].  An  822-bp  fragment  from  the  3’  end  using  PCR  pri¬ 
mers:  2195D  (5’-TGATTYTTTGGTCAT CCNG AAGT-3’; 
a  modification  of  Cl-J-2195  for  amplification  in  Collem- 
bola  [51])  and  FLY10A  (5’-AATGCACTAATCTGCCA 
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Table  1  Anopheles  albitarsis  s.l.  collection  information 


Sample  Size 


Site 

Locality 

State 

Coordinates 

COI 

white 

ITS2 

1 

Urucara 

Amazonas 

S2.32  W57.46 

25 

15 

7 

2 

Paratins 

Amazonas 

S2.37  W56.39 

25 

7 

6 

3 

Campina 

Para 

S2.52  W55.28 

25 

12 

5 

4 

Itaituba 

Para 

S4.10  W55.50 

27 

12 

4 

5 

Santarem 

Para 

S2.26  W54.45 

25 

10 

6 

6 

Monte  Alegre 

Para 

S2.01  W54.05 

25 

9 

6 

7 

Uruara 

Para 

S2.08  W53.38 

25 

9 

6 

8 

Rio  Paru 

Para 

SI. 28  W52.44 

25 

13 

7 

9 

Serra  do  Navio 

Amapa 

N0.53  W52.00 

14 

0 

0 

10 

Santana 

Amapa 

S0.20  W51.11 

18 

9 

2 

11 

Macapa 

Amapa 

N0.20  W51.30 

21 

4 

2 

12 

Tartarugalzinho 

Amapa 

N1.19  W50.57 

15 

4 

0 

13 

Amapa 

Amapa 

N2.10  W50.54 

10 

0 

0 

14 

Salvaterra 

Para 

S0.46  W48.31 

14 

5 

2 

Total 

294 

107 

53 

TATTAG-3’;  a  modification  of  TL2-N-3014  for  amplifica¬ 
tion  in  Simuliid  flies  [52])  was  previously  amplified  for  the 
northeastern  collection  samples  [50].  Individual  PCR  reac¬ 
tions  were  preformed  using  Ready-To-Go-PCR  bead 
(Amersham  Pharmacia/Biotech,  NJ,  USA)  and  run  on  a 
PTC  100  or  200  series  thermal  cycler  (Biorad,  Inc.),  or  a 
PTC-100  (MJ  Research,  Inc.),  using  the  conditions  stipu¬ 
lated  in  Mirabello  and  Conn  [53].  The  Applied  Genomics 
Technology  Core  (Wadsworth  Center)  carried  out  the 
sequencing.  The  forward  and  reverse  COI  sequences  were 
aligned  using  Sequencher  3.0  (Gene  Codes  Corps,  MI, 
USA),  grouped  together  by  sight  and  trimmed  in  PAUP*, 
version  4.0  [54] .  The  complete  overlap  of  both  COI  primer 
sets  created  a  488-bp  fragment  (Additional  file  1). 

An  800-bp  fragment  of  the  white  gene  was  amplified 
using  the  W2R  and  WF  primers,  with  the  PCR  condi¬ 
tions  as  reported  in  Mirabello  and  Conn  [55].  A  500-bp 
fragment  of  the  ribosomal  ITS2  was  amplified  using  the 
primers  18S  and  28S  with  the  parameters  in  Li  and 
Wilkerson  [20].  The  PCR  products  were  cleaned, 
sequenced  and  aligned  creating  a  496-bp  fragment  of 
the  white  gene  and  361-bp  of  the  ITS2  with  complete 
forward  and  reverse  overlap  for  each  marker.  Unique 
sequences  for  all  markers  are  available  in  GenBank 
[GenBank:  HQ026025-HQ026113]. 

Phylogenetic  relationship 

All  A.  albitarsis  s.l.  sequences  were  identified  to  species 
by  multiple  gene  (COI,  white,  ITS2)  comparisons  to 
those  deposited  in  GenBank:  A.  albitarsis  [DQ076207/ 
DQ076208;  AY956299/AY956300;  AF462386 / 

AF462387],  A.  oryzalimnetes,  formerly  A.  albitarsis  B 


[DQ076210-DQ076215;  AY956297/AY956298;  U92333], 
A.  marajoara  [DQ076216/DQ076217-DQ076221/ 
DQ076225;  AY956295/AY956296;  U92334],  A.  dea- 
neorum  [DQ076226/DQ076229;  AY956301/AY956302; 
AF46 175 1/AF46 1752],  A.  janconnae,  [DQ076231/ 
DQ076232]  and  A  albitarsis  F  [DQ228314/DQ228315]. 

Genealogical  trees  were  estimated  using  the  concate¬ 
nated  ( white  +  COI )  data  set.  ITS2  sequences  were 
excluded  from  these  analyses  because  of  relatively  lim¬ 
ited  sample  size.  Maximum  Parsimony  trees  were  gener¬ 
ated  in  PAUP*,  with  one  hundred  replicates  of  a 
heuristic  search  performed  with  an  initial  random  step¬ 
wise  addition  of  sequences  and  TBR  branch  swapping. 
Branch  support  was  estimated  from  1,000  replicates  of  a 
bootstrap  search.  Bayesian  inference  (BI)  analysis  was 
performed  with  Mr.  Bayes  version  3.1  [56,57],  parti¬ 
tioned  by  gene,  using  the  model  of  nucleotide  substitu¬ 
tion  (TPM3uf+G  and  TIM1+I)  that  best  fit  the  white 
gene  and  COI  respectively,  determined  with  jModelTest 
[58,59].  The  settings  were  two  simultaneous,  indepen¬ 
dent  runs  of  the  Markov  Chain  Monte  Carlo  (MCMC) 
for  4  million  generations,  sampling  every  1,000  genera¬ 
tions  with  a  ‘burnin’  of  25%.  The  outgroup  A.  albimanus 
in  the  Albimanus  Section  of  Nyssorhynchus  (unpub¬ 
lished  sequence)  was  chosen  based  upon  its  phylogenetic 
position  in  Sallum  et  al  [60].  Estimates  of  time  to  coa¬ 
lescence  were  calculated  for  the  COI  fragment  only  and 
compared  using  0S  values  [61]  and  BEAST  [62]. 

Genetic  variation 

Genetic  structure  of  two  lineages  was  examined  by  analy¬ 
sis  of  molecular  variance  (AMOVA),  a  method  of  esti¬ 
mating  population  variance  directly  from  molecular  data, 
using  Arlequin  version  3.1.1  [63].  In  addition,  spatial  ana¬ 
lysis  of  molecular  variance  (SAMOVA),  version  1.0  [64] 
was  used  to  cluster  the  488-bp  COI  sequence  data  into 
genetically  and  geographically  homogeneous  populations. 
SAMOVA  generates  F  statistics  (Fsc,  FST,  FCT )  using  the 
AMOVA  approach,  into  K  groups  to  maximize  the 
between  group  variation.  SAMOVA  estimates  were  com¬ 
puted  for  I<  =  2-13  with  1,000  simulated  annealing  steps 
from  each  of  100  sets  of  initial  starting  conditions. 

A  comparison  of  the  pairwise  divergences  calculated 
by  DnaSP  between  COI  lineages  using  the  648-bp  5’  end 
and  the  488-bp  3’  end  tested  the  utility  of  the  Folmer 
region  and  the  3%  species  threshold  using  10  to  13  indi¬ 
viduals  from  each  lineage,  three  closely  related  taxa  and 
an  outgroup  (Anopheles  darlingi).  Anopheles  darlingi  is 
more  appropriate  than  A.  albimanus  for  sister  taxa  com¬ 
parisons  as  both  it,  and  A.  marajoara,  are  in  the  Argyri- 
tarsis  Section  [65].  A  species  screening  threshold  (SST) 
[66]  may  be  more  appropriate  and  can  be  calculated  as 
lOx  the  mean  intraspecific  variation  for  a  group  [67]. 
The  standard  sequence  threshold  was  calculated  and 
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examined  for  Nyssorhynchus  using  4-58  archived  Gen- 
Bank  sequences  for  known  species. 

Population  structure  and  demographic  history 

A  statistical  parsimony  network  estimated  genealogical 
relationships  among  COI  haplotypes  with  a  93%  sequence 
identity  and  95%  identity  for  each  of  the  nuclear  markers 
using  TCS  1.13  [68].  Homoplasy  in  all  networks  was 
resolved  using  the  algorithm  estimation  rules  in  Crandall 
and  Templeton  [69]. 

The  differentiation  and  polymorphism  statistics  for 
COI  sequences  by  species  or  lineage  and  locality  were 
computed  in  DnaSP,  Version  4.0  [70],  and  the  hypoth¬ 
esis  of  strict  neutrality  was  examined  using  the  statistics 
Dt  [71],  D  and  F  [72],  and  R2  [73]  which  are  based  on 
the  frequencies  of  segregating  sites,  and  Fu’s  Fs  [74], 
based  on  the  haplotype  distribution.  Tajima’s  D  and 
both  Fu  and  Li’s  D *  and  F*  are  the  most  effective  tests 
to  detect  background  selection,  whereas  Fu’s  Fs  and  R2 
are  among  the  most  powerful  tests  to  detect  population 
expansion  [73].  All  neutrality  tests  were  calculated  using 
DnaSP,  Version  4.0  or  MEGA  version  3.1  [75]. 

The  mismatch  distribution  (simulated  in  Arlequin)  is  a 
frequency  distribution  of  the  observed  number  of 


pairwise  nucleotide  differences.  The  shape  of  the  distri¬ 
bution  is  highly  informative  and  able  to  differentiate 
between  a  population  expansion  and  equilibrium  [76], 
whereas  the  smoothness  (raggedness  statistic)  distin¬ 
guishes  the  fit  of  the  empirical  data  to  the  model  [77]. 
Statistically  significant  differences  between  observed  and 
simulated  distributions  were  evaluated  with  the  sum  of 
square  deviations  (SSD)  to  reject  the  hypothesis  of 
demographic  expansion  [78]. 

Results 

Phylogenetic  relationship 

The  estimated  MP  and  BI  trees,  each  with  42  parsimo¬ 
niously  informative  sites,  indicated  two  distinct  lineages 
of  A.  marajoara  with  varying  levels  of  support  (Figure  1). 
The  more  derived  lineage  1  remains  ambiguous  (only 
moderate  support,  0.66  and  52%);  however  it  does  have 
the  broadest  distribution  and  contains  samples  from  near 
(~56  km)  the  type  locality  on  Marajo  Island,  Para  state, 
Brazil  [79].  Lineage  2,  in  contrast,  is  more  restricted 
geographically  and  has  higher  support  (1.00  and  89%, 
Figure  1).  The  relatively  low  BI  branch  support  (0.66)  for 
A.  marajoara  lineage  1  coupled  with  the  moderate  sup¬ 
port  among  some  of  the  subdivisions  suggests  this  lineage 
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is  paraphyletic  and  that  haplotypes  that  would  otherwise 
increase  lineage  support  are  missing  (extinct  or  not 
sampled).  Within  both  lineages,  smaller  subdivisions,  not 
related  to  geography,  were  apparent  between  the  haplo¬ 
types  that  are  separated  by  seven  mutations  in  lineage  1 
(between  haplotypes  11  and  13)  and  10  mutations  in  line¬ 
age  2  (haplotypes  48  and  49),  illustrated  in  the  network 
(Figure  2). 

The  estimated  time  to  coalescence  for  the  two  COI 
lineages  was  obtained  using  the  equations,  6S  -  2Nefj, 
[29]  and  4Ne  =  years  since  coalescence  [80].  Based  upon 


the  488-bp  fragment,  0S  is  7.530  (SD  =  1.857).  There¬ 
fore,  the  estimate  of  the  time  to  coalescence  is  232,476  - 
384,716  years  ago,  in  the  Pleistocene.  A  similar  estimate 
using  BEAST,  the  standard  arthropod  mtDNA  mutation 
rate  of  2.3%  per  million  years  [81],  an  applied  HKY 
model  with  gamma  distribution  assuming  constant  size 
and  a  relaxed  molecular  clock,  yielded  an  estimate  of 
3.07  mya.  The  discrepancy  between  coalescent  estimates 
is  likely  due  to  the  10  generations  per  year  that  is  fac¬ 
tored  into  the  Walton  equation  [29].  Although  the 
incorporation  of  fossil  calibration  points  can  generate 


Figure  2  Parsimony-based  haplotype  networks  of  68  haplotypes  from  488-bp  of  the  COI  gene  sequenced  from  294  specimens  of  A. 
marajoara  s.l.  Statistical  parsimony  network  of  the  COI  haplotypes  (see  Additional  file  2)  from  the  14  Amazonian  localities  based  on  a  488-bp 
fragment  and  93%  sequence  similarity.  Dark  grey  represents  A.  marajoara  lineage  1;  light  grey,  lineage  2;  and  black  indicates  A  oryzalimnetes. 
Each  line  represents  a  single  mutational  event;  black  dots  represent  unobserved,  possibly  extinct,  haplotypes.  Haplotype  numbers  are  indicated 
in  bold,  *  denotes  the  inclusion  of  known  GenBank  sequences  identical  to  haplotypes  1,  2,  3,  5,  42,  46,  47,  and  50  identified  as  A  marajoara. 
Haplotypes  62-67  (light  grey)  are  A  oryzalimnetes  from  GenBank  only;  and  S  represents  the  corresponding  collection  sites  (see  Table  1). 
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more  reliable  divergence  estimates  [82],  no  data  were 
available  because  the  mosquito  fossil  record  is  too 
recent  and  rare  [83]. 

Genetic  variation 

The  between  lineage  variance  for  the  Folmer  region  of 
the  two  lineages  of  A.  marajoara  (2.94%)  is  marginally 
below  the  3%  BOLD  threshold.  However,  it  is  compar¬ 
able  to  among  species  comparisons  within  the  Albitarsis 
Complex  (e.g.,  A.  albitarsis  s.s.  vs.  A.  oryzalimnetes, 
2.64%;  Additional  file  2).  The  3’  end  of  the  COI  (above 
diagonal,  Additional  file  2)  is  less  conserved,  resulting  in 
higher  estimates  of  divergence.  Among  closely  related 
Nyssorhynchus  spp.  (Table  2),  the  average  intraspecific 
variation  is  0.0116,  and  suggests  that  a  SST  of  11.6% 
might  suffice  within  this  subgenus.  However,  this  esti¬ 
mate  is  limited  as  only  7  out  of  32  currently  recognized 
species  [65]  were  available.  Intraspecific  nucleotide  dif¬ 
ferences  among  Nyssorhynchus  species  were  consistently 
less  than  2%,  indicating  that  the  higher  estimate  (0.021) 
of  combined  A.  marajoara  lineages  likely  represents  an 
overlooked  species  complex. 

AMOVA  indicated  that  61.94%  (p  =  0.0000)  of  the  var¬ 
iance  was  explained  by  between-group  variation  of  A. 
marajoara  lineages  1  and  2.  SAMOVA  analysis,  provid¬ 
ing  resolution  for  lineage  1  only,  defined  two  groups 
that  correspond  to  northeastern  and  western  Amazonia, 
with  61.36%  regional  variation  (Figure  3).  Several  COI 
haplotypes  were  found  in  both  the  northeastern  and 
western  populations  of  lineage  1  indicating  at  least  some 
gene  flow  across  the  geographic  barrier.  High  variation, 
especially  between  lineages  1  and  2,  and  between  geo¬ 
graphic  populations  in  both  COI  and  white  genes,  is 
strongly  supported  by  population  differentiation  statis¬ 
tics,  although  the  G$t  values  were  not  significantly  dif¬ 
ferent  (Table  3).  Furthermore,  the  Kt  values  were  much 
greater  for  the  COI  gene  (11.9  and  6.96),  compared  with 
the  white  gene  (1.0). 


Population  structure  and  demographic  history 

Analysis  of  mitochondrial  data  from  294  samples  among 
seven  riverine  and  seven  non-riverine  localities  in  Ama¬ 
zonas,  Para,  and  Amapa  states,  Brazil  (Figure  3)  led  to 
the  discovery  of  two  co-occuring,  but  distinct  A.  mara¬ 
joara  lineages  that  are  unable  to  be  connected  using  sta¬ 
tistical  parsimony  (Figure  2).  Therefore,  a  median 
joining  network  that  uses  alternate  algorithms  to  remove 
homoplasy  was  conducted  and  indicated  that  a  mini¬ 
mum  of  13  mutational  steps  separate  the  lineages, 
shown  in  Figure  2.  Seven  fixed  mutations  were  esti¬ 
mated  between  lineages  based  on  DnaSP.  Additionally, 
A.  oryzalimenetes  was  identified  from  Itaituba  and  from 
Macapa,  a  new  record  (Figure  3). 

COI  haplotypes  were  detected  (Figure  2).  Of  the  A. 
marajoara  haplotypes,  26  (44.8%)  were  shared  among 
localities,  and  32  (55.2%)  were  unique  (Additional  file 
3).  The  most  common  haplotypes  were  1  (n  =  16),  2  ( n 
-  26),  3  (n  =  20)  and  5  (n  =  78)  of  A.  marajoara  lineage 
1,  and  haplotypes  43  ( n  =  9)  and  46  («  =  9),  50  («  = 
32),  and  56  ( n  =  11)  of  A.  marajoara  lineage  2  (Figure 
2).  There  was  a  relatively  high  proportion  of  singletons 
(32,  55.17%)  in  the  A.  marajoara  lineages  combined, 
with  24/42  for  lineage  1,  and  9/16  for  lineage  2.  Overall, 
both  A.  marajoara  lineages  were  indicative  of  a  category 
II  pattern  [26]  characterized  by  pronounced  genetic 
gaps  between  some  branches  and  the  co-distribution  of 
principle  lineages  over  a  wide  area,  which  could  theore¬ 
tically  arise  in  a  species  with  large  evolutionary  Ne 
(effective  population  size)  and  high  gene  flow.  Lineage  1 
contains  a  star  shaped  node  surrounding  haplotype  5, 
with  short  branches  and  an  excess  of  singleton  muta¬ 
tions,  predominantly  from  the  northeastern  localities, 
which  suggests  a  demographic  expansion,  background 
selection  or  selective  sweep  [74,84].  In  contrast,  lineage 
2  indicates  balancing  selection  with  its  longer  branches, 
missing  haplotypes  and  near  equal  distribution  of  shared 
haplotypes  and  single  mutations,  consistent  with  a  signal 


Table  2  COI  K2P  intraspecific  nucleotide  difference  of  Nyssorhynchus  species  using  the  fragment  of  the  Folmer  region 
available  in  GenBank  with  current  A.  marajoara  and  individual  lineage  comparisons  below  the  double  line. 


Species 

N 

Fragment  length 

Intraspecific  difference  mean  (SD) 

GenBank  Accession 

A.  albitarsis  s.s 

6 

703 

0.010  (±  0.003) 

DQ076204-DQ076209 

A.  oryzalimnetes 

6 

703 

0.004  (±  0.002) 

DQ076210-DQ076215 

A.  deaneorum 

4 

703 

0.015  (±  0.003) 

DQ076226/227,  DQ076229/230 

A.  goeldii 

16 

493 

0.014  (±  0.003) 

EU84831 3-EU848328 

A.  braziliensis 

58 

529 

0.015  (±  0.003) 

DQ913858-DQ913877 

A.  darlingi 

36 

460 

0.012  (±  0.003) 

DQ298209-DQ298244 

/I.  dunhami 

4 

493 

0.010  (±  0.003) 

EU848329-EU848332 

A.  marajoara  s.l 

10 

703 

0.024  (±  0.003) 

DQ07621 6-DQ076225 

Lineage  1 

7 

703 

0.016  (±  0.002) 

DQ07616,  DQ0762 18-20,  DQ076222-24 

Lineage  2 

3 

703 

0.014  (±  0.002) 

DQ076217,  DQ076221,  DQ076225 
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Figure  3  Distribution  of  collection  localities  of  A.  marajoara  s.l.  and  A.  oryzalimentes  Map  of  South  America  indicating  the  location  of  the 
14  collection  localities,  lineage  distributions  and  the  spatial  grouping  derived  from  the  SAMOVA  analysis.  Dotted  line  corresponds  to  SAMOVA 
defined  groupings  (eastern  and  western  Amazon),  based  on  sequence  similarity  and  geographic  distance  for  A.  marajoara  lineage  1  among  all 
localities.  F CT  corresponds  to  between  group  variation  and  Fsc  indicates  variation  within  groupings. 


of  an  older  lineage.  It  could  be  argued  that  haplotype  50 
(lineage  2,  Figure  2)  and  the  5  singletons  that  arise  from 
it  also  constitute  a  star-shaped  node,  which  suggest 
recent  expansion.  Lineage  2  appears  to  be  restricted  to 
the  westernmost  localities. 

Statistical  parsimony  networks  of  nuclear  data  ( white 
gene  and  ITS2)  containing  comparative  numbers  of  spe¬ 
cimens  of  both  CO/-defined  lineages  identified  a  single 
A.  marajoara  lineage  (Figure  4A,  B).  Haplotype  clusters 
(Figure  4A)  correspond  primarily  to  northeastern  and 
western  populations  with  a  few  outliers  including  haplo¬ 
type  L  and  two  individuals  in  haplotype  A.  Both  lineages 
retained  the  4th  intron  in  the  white  gene,  which  is 
absent  from  all  other  Albitarsis  Complex  members  [11]. 
Additionally,  the  ITS2  length  was  identical  (361-bp)  in 
all  sequences,  independent  of  the  lineage.  Copy  number 
in  microsatellite  regions  at  positions  118  (GT)  and  273 
(GA)  and  the  2-bp  indel  at  position  271  were  not  con¬ 
gruent  with  either  lineage  or  geography,  but  consistent 


with  findings  by  Li  and  Wilkerson  [20,42].  Genetic  poly¬ 
morphism  analyses  of  the  COI  sequences  indicated  that 
haplotype  diversity  was  greater  among  lineage  1  popula¬ 
tions,  whereas  lineage  2  localities  exhibited  slightly 
higher  nucleotide  diversity  (Additional  file  3).  In  lineage 
1,  there  was  greater  diversity  among  the  western  local¬ 
ities  (1-8),  compared  to  northeastern  Amazon  (9-14). 
The  white  gene  exhibited  lower  nucleotide  diversities 
overall  compared  to  COI,  and  some  differences  in  diver¬ 
sity  between  populations,  particularly  in  Tartarugalzinho 
(locality  12),  where  haplotypes  D  and  T  are  present, 
separated  by  11  mutation  steps  (Additional  file  4).  Mod¬ 
erate  nucleotide  diversity  in  the  COI  data  coupled  with 
the  single  A.  marajoara  lineage  detected  with  the  white 
gene  may  reflect  ancestral  polymorphism  as  a  result  of 
co-occurence  and  haplotype  mixing  [85]. 

The  high  COI  haplotype  diversity  combined  with  rela¬ 
tively  low  nucleotide  diversity  in  both  lineages  (Addi¬ 
tional  file  3)  suggests  a  population  bottleneck  followed 
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Table  3  Inter-  and  intra-population  differentiation  of  A. 
marajoara  lineages  and  populations  on  either  side  of  the 
SAMOVA  generated  boundary. 


COI  (488-bp  fragment) 

white  (496  bp) 

Lineage  1  vs. 

Lineage  1  E  vs. 

East  vs.  West 

2  (W  =  289) 

W  (N  =  209) 

Hs 

0.81757*** 

0.71449*** 

0.65635*** 

Ks* 

1 .66659*** 

1.05701*** 

0.99335*** 

Z* 

9.19220*** 

8.65007*** 

7.70914*** 

Snn 

0.99301*** 

0.8298*** 

0.82814*** 

x2 

282.612*** 

148.927*** 

59.991*** 

Gst 

0.08536 

0.13799 

0.05827 

kt 

1 1 .92573 

6.95565 

1.03981 

N,  number  of  total  individuals  compared;  HSl  genetic  differentiation  based  on 
haplotype  data;  Ks*,  differentiation  based  on  sequence  data;  Z*  rank  statistic 
to  analyze  sequence  similarity;  Snn,  measures  how  often  the  nearest 
neighbours  of  sequences  are  found  in  the  same  population;  c2,  genetic 
differentiation  based  on  allele  frequencies;  GST,  genetic  differentiation;  Kv 
average  number  of  nucleotide  differences;  **  significance,  p  <  0.02;  *** 
significance,  p  <  0.0001 

by  an  expansion.  The  negative  values  for  three  neutrality 
tests  in  lineage  1  (Table  4)  may  reflect  an  excess  of  rare 
polymorphisms  consistent  with  either  positive  selection 
or  an  increase  in  population  size,  although  only  one  of 
these  tests  ( Fs )  was  significant.  In  contrast,  the  positive 
values  in  lineage  2  indicate  an  excess  of  intermediate- 
frequency  alleles  in  a  population,  which  can  result  from 
either  balancing  selection  or  population  bottlenecks 
[86].  Only  the  northeastern  subdivision  of  lineage  1 
revealed  strong  support  for  a  population  expansion 
event.  The  comparison  between  the  northeastern  and 
western  populations  within  lineage  1  supports  the 
SAMOVA  findings  of  a  geographic  barrier,  depicted  in 
Figure  3. 

The  mismatch  distribution  model  for  sudden  expan¬ 
sion  was  marginally  significant  for  both  lineages.  Both 
exhibited  a  bi-modal  distribution  (Figure  5)  with  a  com¬ 
plete  separation  and  significant  raggedness  values,  sug¬ 
gesting  constant  population  size  [87].  However,  an 
alternative  interpretation  would  be  that  there  were  two 
expansions  [88]  dated  to  the  Pleistocene  (lineages  1  and 
2,  142,203  (3,678-220,163)  and  114,713  (8,545-230,635) 
ybp,  respectively).  The  lineage  1  northeastern  population 
(Figure  2)  expanded  more  recently,  approximately  5,000 
years  ago,  during  the  Holocene. 

Discussion 

The  combined  sequence  results  support  divergence  in 
A.  marajoara  depicting  separate  lineages,  distinct  from 
A.  janconnae.  The  more  ancestral  lineage  2  appears 
monophyletic,  whereas  moderate  substructure  within 
lineage  1  suggests  paraphyly.  Nuclear  markers,  however, 
consistently  depict  a  single  A.  marajoara  lineage. 


Discrepancies  between  mtDNA  and  nDNA  are  not 
enough  to  refute  speciation,  for  example  the  newly 
recognized  species  A.  janconnae  and  A.  marajoara  form 
a  single  lineage  using  the  protein  coding  nuclear  white 
gene  and  ITS2  sequences  [20,21],  but  complete  COl 
sequences  [10]  indicate  monophyly  of  each  taxon,  and 
morphometries  analysis  depicts  A.  janconnae  as  a  sepa¬ 
rate  taxon  in  the  complex  [22].  Nuclear  mutation  rates 
are  generally  slower  than  mitochondrial  ones  [89],  but 
see  Hellberg  [90],  thus  the  lack  of  divergence  detected 
in  the  nuclear  sequences  in  A.  marajoara  in  the  present 
study  suggests  either  the  differentiation  is  restricted  to 
the  mitochondrial  genome,  or  it  is  recent  and  not  yet 
visible  in  the  nuclear  genome.  Given  that  lineage  1  has  a 
broader  distribution  and  includes  samples  from  near  the 
type  locality,  lineage  1  would  be  more  appropriately 
named  A.  marajoara  s.s.  The  status  of  lineage  2  is  not 
yet  resolved;  it  may  be  a  new  cryptic  species  belonging 
to  the  complex  as  suggested  by  Wilkerson  et  al.  (2005) 
from  Manaus,  Brazil  [41]  and  confirmed  by  JF  Ruiz 
(personal  communication)  using  the  COI  barcode  region 
of  the  Albitarsis  subgroup  across  South  America. 

Genetic  variation  between  lineages  was  explained  by 
61.94%  between  group  differences.  Pairwise  divergence 
estimates  based  on  the  488-bp  fragment  confirmed 
3.82%  divergence  between  lineages  (Table  3),  which  is 
comparable  to  the  values  observed  among  well-known 
Albitarsis  Complex  species  (for  example,  3.52%  between 
A.  albitarsis  s.s  and  A.  oryzalimnetes) .  These  data  may 
support  cryptic  or  incipient  speciation  [91].  The  effec¬ 
tiveness  of  the  Folmer  region  and  the  BOLD  threshold 
of  3%  are  questionable  for  at  least  some  species  of  Nys- 
sorhynchus  because  there  is  only  moderate  support 
between  known  species  A.  oryzalminetes  and  A.  albitar¬ 
sis  s.s.  If  the  average  intraspecific  variation  of  the  Folmer 
region  for  the  subgenus  (11.6%)  is  used  as  a  species 
threshold,  the  pairwise  divergences  of  all  sister  taxa 
(including  A.  darlingi  and  A.  albimanus )  do  not  exceed 
this  standard.  Such  a  high  SST  is  likely  to  result  in  false 
negatives  even  among  recognized  species,  therefore 
separate  intraspecific  values  and  interspecific  diver¬ 
gences  were  examined  for  resolution,  where  intraspecific 
COI  variation  was  <2%  and  interspecific  variation  was 
between  6-10%,  comparable  to  other  anopheline  species 
differences  [[92,93],  unpublished  data]. 

Pleistocene  divergence  in  mtDNA  has  also  been 
detected  for  a  wide  range  of  insect  taxa  across  Central 
and  South  America,  including  L.  longipalpis  [94],  H. 
erato  [81],  R.  prolixus  [95],  and  mosquitoes  A.  darlingi 
[53]  and  A.  albimanus  [85]  suggesting  climatic  changes 
may  be  a  common  force  driving  Neotropical  speciation. 
Pleistocene  divergence  coupled  with  paleoclimate 
records  of  the  Miocene  describing  the  world  as  warmer 
and  exhibiting  greater  humidity  and  precipitation  [96]; 
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Figure  4  Parsimony-based  haplotype  networks  of  nuclear  white  and  ITS2  for  A.  marajoara.  Statistical  parsimony  network  (95%):  A)  107 
white  gene  sequences  and  their  corresponding  haplotypes  and  B)  53  ITS2  sequences.  There  was  no  distinction  between  lineages;  grey 
haplotypes  are  all  from  western  localities  (1-8  Figure  1;  Table  1)  and  light-colored  haplotypes  are  all  from  northeastern  localities  (9-14). 


Table  4  Results  of  neutrality  tests  based  on  COI 
sequences  of  A.  marajoara  from  Amazonian  Brazil. 


A.  marajoara 

N 

Dt 

D* 

F* 

Fs 

«2 

Lineage  1 

209 

0.92686 

-1.44039  -0.54355 

-9.637* 

0.1147 

Western 

122 

0.58594 

-0.61517  -0.17709 

-6.611* 

0.1101 

Eastern 

87 

-2.40421** 

-1.70059  -2.5166* 

-15.498** 

0.0214** 

Lineage  2 

80 

2.52378* 

0.52692 

1.50581 

1.920 

0.1 843 

N,  sample  size;  DT>  Tajima's  D;  D,  Fu  and  Li's  D;  F,  Fu  and  Li's  F;  FS/  Fu's  Fs;  R2, 
Ramos-Onsin's  and  Rosa's;  *  significance,  p  <  0.50;  **  significance,  p  <  0.02. 


[97]  support  the  more  ancestral  depiction  of  lineage  2. 
Pleistocene  climatic  changes,  including  lower  tempera¬ 
ture,  precipitation  and  carbon  dioxide  levels  than  those 
exhibited  today,  may  have  contributed  to  the  divergence 
of  lineage  1  [98,99]. 

Lineages  co-occur  along  the  Amazon  River  with  no 
obvious  species  diversity  gradients,  although  data  indi¬ 
cate  several  “high  diversity  localities”  including  San- 
tarem,  which  is  a  recognized  hotspot.  However,  this  may 
be  an  artefact  resulting  from  the  relative  logistical  ease 


McKeon  et  al.  Malaria  Journal  2010,  9:271 
http://www.malariajournal.eom/content/9/1/271 


Page  10  of  13 


of  access  to  these  localities  and  the  many  studies  that 
have  been  conducted  there  [100].  The  geographic  bar¬ 
rier  detected  within  A.  marajoara  lineage  1  is  in  the 
same  region  as  the  mtDNA  divisions  between  northeast¬ 
ern  and  central  western  Amazon  populations  in  A.  dar- 
lingi  [101],  A.  nuneztovari  [102]  and  Atta  species  ants 
[103].  With  similar  distributions  and  geographic  bound¬ 
aries  noted  in  both  mosquitoes  and  ants,  it  seems  unli¬ 
kely  that  the  boundary  is  the  result  of  dispersal  abilities. 
The  detection  of  shared  haplotypes  in  the  present  study 
on  both  sides  of  the  barrier  indicates  a  permeable 
boundary,  suggesting  the  more  geographically  restricted 
lineage  2  may  be  an  artefact  of  incomplete  sampling,  or 
perhaps  a  result  of  climate  variation. 


Climate,  particularly  rainfall,  is  a  strong  descriptor  of 
broad-scale  species-richness  patterns  in  the  tropics  [104]. 
Vasconcelos  et  al  [105]  noted  a  restricted  geographic 
distribution  of  ants  in  the  western  part  of  the  basin,  which 
to  a  great  extent  reflected  a  west  to  east  gradient 
of  decreasing  rainfall.  Additionally,  Sombroek  [106] 
explained  an  intricate  pattern  of  continuous  rainfall  in  the 
western  interior  of  the  Amazon,  in  contrast  to  the  coasts, 
with  a  pronounced  dry  season  containing  a  dry  belt  or 
corridor  occurring  near  Rio  Jari,  Amapa  state  (near  the 
SAMOVA  barrier  for  A.  marajoara  lineage  1).  Miocene 
flooding  [103]  and  subsequent  Pleistocene  climatic  events 
may  explain  the  porous  nature  of  the  barrier. 

Recent  deforestation  and  habitat  fragmentation  result¬ 
ing  from  the  1967  initiation  of  the  single  largest  forest 
plantation  in  Latin  America,  an  agro-forestry  venture 
known  as  the  Jari  project  [107]  in  northern  Brazil  could 
have  contributed  to  an  increase  in  A.  marajoara  abun¬ 
dance  in  this  region.  Clearing  of  forest,  in  combination 
with  an  increase  in  human  and  domestic  animal  host 
abundance,  may  have  led  to  an  A.  marajoara  population 
increase  in  Macapa,  the  capital  of  Amapa  -state,  in 
northeastern  Amazonian  Brazil  [15]. 

Among  the  riverine  localities,  the  hypothesis  of  sud¬ 
den  population  expansion  could  not  be  rejected, 
although  the  mismatch  distributions  were  smooth  and 
bimodal  for  both  lineages.  The  reduced  genetic  variation 
of  A.  marajoara  lineage  1  in  the  northeastern  Amazon 
region  coupled  with  the  unimodal  distribution  and  the 
star-like  pattern  of  the  haplotype  network  are  consistent 
with  a  rapid  population  expansion  in  the  northeast. 
A.  marajoara  tends  to  breed  in  marshy  sunlit  pools  and 
may  have  expanded  into  the  northeast  where  savannah 
and  agricultural  habitats  would  be  favorable.  The  popu¬ 
lation  expansion  occurred  during  the  Holocene,  possibly 
indicating  a  recent  colonization.  This  explanation  is  sup¬ 
ported  by  the  expansion  of  savannah  during  periods  of 
fire-associated  drought  and  extinction-recolonization  of 
rainforest  tree  populations  in  lowland  areas  during  the 
Holocene  [108-110].  Moderate  gene  flow  in  lineage  1 
may  hamper  local  vector  control  in  Amazonas,  Para  and 
Amapa  states  because  of  potential  reinvasion  and  the 
spread  of  insecticide  resistant  genes  [111]. 

Conclusions 

Anopheles  marajoara  is  an  important  vector  in  the 
savannah  areas  of  South  America  [13],  and  with  some  of 
the  highest  incidences  of  malaria  occurring  in  the  states 
of  Para  and  Amapa  [14,15,112]  it  is  likely  that  A.  mara¬ 
joara  will  continue  to  increase  its  regional  importance 
in  malaria  transmission.  Additional  studies  will  be 
needed  to  determine  whether  the  subdivision  resulting 
in  the  two  co-occurring  lineages  in  A.  marajoara  is 
strictly  the  product  of  genetic  variation  and  evolutionary 
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processes  impacted  by  geographical  barriers,  or  varia¬ 
tions  in  ecology  and  behaviour  that  may  have  lead  to 
niche  divergences  and  microallopatry  [113-116].  If  both 
lineages  are  implicated  as  vectors,  the  characteristics  of 
their  breeding  sites,  their  behaviours  and  their  migration 
history  could  be  used  to  predict  changes  in  malaria 
transmission  patterns  in  the  Amazon  basin  [15]  and 
other  endemic  regions,  and  provide  useful  information 
for  targeted  vector  control  strategies. 

Additional  material 


Additional  file  1:  Complete  COI  with  fragment  orientation  and 
overlap.  Representative  schematic  of  the  COI  gene,  Folmer  region  and 
fragments;  light  blue  bar  indicating  the  region  amplified  by  primers 
UEA3  and  UEA10;  the  purple  bar  depicting  fragment  previously  amplified 
from  primers  2195D  and  C1-J-2195.  *  denotes  the  piece  of  the  COI  that 
was  used  for  the  phylogeography  and  population  structure  analysis. 

Additional  file  2:  DXy  with  Jukes  Cantor  pairwise  divergence 
between  species  and  lineages  based  upon  the  648-bp  Folmer 
region  (below)  and  the  488-bp  COI  fragment  (above). 

Additional  file  3:  Description  of  shared  COI  haplotypes  and  genetic 
polymorphism  statistics  for  A.  marajoara  lineages  and  A. 
oryzalimnetes  N,  the  number  of  sequences;  1-Zfj2,  haplotype  diversity;  11, 
nucleotide  diversity;  SD,  standard  deviation;  P,  polymorphic  sites;  and  K, 
average  number  of  differences.  Underlining  indicates  shared  haplotypes 
and  bold  indicates  diversity  among  lineages  or  species. 

Additional  file  4:  Description  of  shared  white  haplotypes  and 
genetic  polymorphism  statistics  for  A.  marajoara  s.l. 
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